Abstract

This work deals with the estimation of the extreme value index and extreme quantiles for heavy tailed data, 
randomly right truncated by another heavy tailed variable. Under mild assumptions and the condition that 
the truncated variable is less heavy-tailed than the truncating variable, asymptotic normality is proved for both 
estimators. The proposed estimator of the extreme value index is an adaptation of the Hill estimator, in the 
natural form of a Lynden-Bell integral. Simulations illustrate the quality of the estimators under a variety of 
situations. 


1. Introduction 

Extreme value statistics is an active domain of research, with numerous fields of application, and which benefits
from an important literature in the context of i.i.d. data, dependent data, and (more recently) multivariate or
spatial data. For univariate data, semiparametric estimation of the tail of the underlying distribution (for instance,
estimation of extreme quantiles) requires in the first place accurate estimation of the so-called extreme value index
(e.v.i.). In recent years, several authors have dedicated their efforts to obtaining good estimates of the e.v.i. for
incompletely observed data, i.e. randomly censored or truncated data (note that, since the interest generally
lies in the upper tail of the data, left censoring or left truncation is not a relevant framework,
and censoring or truncation are therefore considered from the right). In those contexts, the usual estimators of the
e.v.i. need some modifications, because they would lead to erroneous estimates if blindly applied
to censored or truncated data. Some references for extreme value estimation in the context of randomly censored
observations are [?], [?] and [?].

The first published work on extreme value estimation under random truncation was written by L. Gardes and
G. Stupfler [5], who dealt with heavy-tailed right-truncated data (their work provides motivations and many
references on the main existing results about truncated samples; we refer to [5] in this regard). The framework of
randomly right-truncated data will be precisely defined in the next section; let us just sketch it for the moment: we
consider i.i.d. couples $(X_i, Y_i)$ and, among those couples, we only observe those which
satisfy the condition $X_i \le Y_i$. The actually observed data will then be denoted by $((X_i^*, Y_i^*))_{1\le i\le n}$. Below, $F$ and $G$ will
stand for the respective distributions of $X$ and $Y$, whereas $F^*$ and $G^*$ will stand for the conditional distributions of
$X$ and $Y$ given that $X \le Y$: the latter two are therefore the distributions of the observed samples $(X_i^*)_{1\le i\le n}$ and
$(Y_i^*)_{1\le i\le n}$. The first objective is to estimate the e.v.i. of $X$.

The original idea in [5] was to notice that the extreme value indices $\gamma_1^*$ and $\gamma_2^*$ of $F^*$ and $G^*$ are related in a
very simple way to those of $F$ and $G$, $\gamma_1$ and $\gamma_2$: they proved that (when both $F$ and $G$ are
heavy-tailed)

$$\gamma_1^* = \frac{\gamma_1\gamma_2}{\gamma_1+\gamma_2} \quad\text{and}\quad \gamma_2^* = \gamma_2.$$
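These relations readily yield an estimator of the parameter of interest $\gamma_1$ by relying on the usual Hill
estimators of $\gamma_1^*$ and $\gamma_2^*$:

$$\hat\gamma_{GS} = \frac{\hat\gamma_1^*(k_1)\,\hat\gamma_2^*(k_2)}{\hat\gamma_2^*(k_2) - \hat\gamma_1^*(k_1)}, \quad\text{where}\quad \hat\gamma_1^*(k_1) = \frac{1}{k_1}\sum_{i=1}^{k_1}\log\frac{X^*_{n-i+1,n}}{X^*_{n-k_1,n}} \quad\text{and}\quad \hat\gamma_2^*(k_2) = \frac{1}{k_2}\sum_{i=1}^{k_2}\log\frac{Y^*_{n-i+1,n}}{Y^*_{n-k_2,n}}, \tag{1}$$

where $X^*_{1,n} \le \dots \le X^*_{n,n}$ and $Y^*_{1,n} \le \dots \le Y^*_{n,n}$ denote the usual order statistics of both samples, and $k_1$ and $k_2$ are
the numbers of upper observations which are kept for estimating $\gamma_1^*$ and $\gamma_2^*$.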
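As an illustration of relation (1), here is a minimal Python sketch of the two Hill estimators and of their combination. The function names `hill` and `gamma_gs` are hypothetical (they do not come from [5]), NumPy is assumed to be available, and no safeguard is included for the case where the denominator is close to zero.

```python
import numpy as np


def hill(sample, k):
    """Hill estimator built on the k largest values of a positive sample."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    # Mean log-excess of the k upper order statistics over the (n-k)-th one.
    return float(np.mean(np.log(x[n - k:] / x[n - k - 1])))


def gamma_gs(x_obs, y_obs, k1, k2):
    """Combine the two Hill estimators as in relation (1)."""
    g1 = hill(x_obs, k1)  # estimates gamma_1^* = gamma_1*gamma_2/(gamma_1+gamma_2)
    g2 = hill(y_obs, k2)  # estimates gamma_2^* = gamma_2
    return g1 * g2 / (g2 - g1)
```

The combination simply inverts the relation $1/\gamma_1^* = 1/\gamma_1 + 1/\gamma_2$, which is why the result is unstable when the two Hill estimates are close to each other.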

The authors of [5] also investigated the behavior of an estimator of $F$ in the upper tail, and thereby provided
a Weissman-type estimator of extreme quantiles in this truncation context and proved its asymptotic normality.
However, their results suffer from a calibration problem, since they are proved only under the condition
that one of the numbers $k_1$ and $k_2$ of order statistics used for estimating $\gamma_1^*$ and $\gamma_2^*$ grows to infinity faster than
the other. The question of getting rid of this restriction was addressed in the preprint [2].

In this work, we consider the same framework of randomly right-truncated heavy-tailed data, but adopt a new
method for defining an estimator of the extreme value index $\gamma_1$ of the truncated sample: in Section 2, this estimator
$\hat\gamma_n$ is defined as a Lynden-Bell integral, requiring a single threshold to be chosen, and asymptotic normality is
proved for $\hat\gamma_n$ as well as for an estimator of extreme quantiles, under appropriate but mild conditions. Section 3 is
devoted to a simulation study illustrating the performance of these estimators (with a tentative comparison
to the performance of the estimator defined in [5]), and Sections 4 and 5 respectively contain a conclusion and the
proofs of the results. The appendix recalls important (and needed) results previously published in the literature,
and also contains a technical lemma which is repeatedly used in the proofs section.


2. Framework and statement of the results 
2.1. Notations and definition of the estimators 

Let $((X_i, Y_i))_{1\le i\le \tilde n}$ be $\tilde n$ independent copies of a couple $(X, Y)$, where $X$ and $Y$ are positive independent random
variables having respective cumulative distribution functions $F$ and $G$. For convenience, we suppose that the lower
endpoints of $F$ and $G$ are both equal to 0 (but this will have no influence on the results, since only the highest data
values are retained for tail estimation). We assume in this work that $X$ and $Y$ are heavy-tailed, meaning
that $1 - F$ and $1 - G$ (also assumed to be continuous) are regularly varying with respective indices $-1/\gamma_1$ and $-1/\gamma_2$,
where $\gamma_1$ and $\gamma_2$ are $> 0$.

We only observe the couples $(X_i, Y_i)$ which satisfy $X_i \le Y_i$: in other words, the original data $X_i$ are randomly
truncated from the right by the $Y_i$, and the actually observed sample is $((X_i^*, Y_i^*))_{1\le i\le N}$, where $N$ follows the binomial $\mathcal{B}(\tilde n, p)$
distribution, $p$ denoting the (unknown) probability of non-truncation $p = P(X \le Y)$. Consequently, the distribution
of the $X_i^*$ becomes

$$F^*(x) = P(X \le x \mid X \le Y) = \frac{1}{p}\int_0^x \bar G(t)\,dF(t), \tag{2}$$

where $\bar G = 1 - G$.

Conditionally on $N = n$, the couples $(X_1^*, Y_1^*), \dots, (X_n^*, Y_n^*)$ are independent and identically distributed, and $X_i^*$
is no longer independent of $Y_i^*$. It is important to note that, in the sequel, we will work conditionally on $N = n$,
where $n$ is some deterministic sample size, and we will therefore handle the sample $(X_1^*, Y_1^*), \dots, (X_n^*, Y_n^*)$ without
further reference to $N$.

In this work, $F_n$ will denote the classical Lynden-Bell (nonparametric maximum likelihood) estimator of $F$,
namely

$$F_n(x) = \prod_{X_i^* > x}\left(1 - \frac{1}{nC_n(X_i^*)}\right), \quad\text{where}\quad C_n(x) = \frac{1}{n}\sum_{i=1}^n 1_{\{X_i^* \le x \le Y_i^*\}}$$

(with the usual convention that a product over the empty set equals 1), where $C_n$ is the empirical estimator of the function

$$C(x) = P(X \le x \le Y \mid X \le Y) = p^{-1}\,\bar G(x)\,F(x), \tag{3}$$

which plays an important role in the analysis of truncated data. Note that $F_n$ is very close to, but strictly speaking
different from, the estimator of $F$ considered in [5] ($F_n$ takes rational values, which is not the case of the latter).
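The product form of the Lynden-Bell estimator can be sketched in a few lines of Python. This is only an illustration under the assumption that there are no ties in the data; the names `c_n` and `lynden_bell_cdf` are hypothetical, and the quadratic-time evaluation is chosen for readability rather than speed.

```python
import numpy as np


def c_n(x_obs, y_obs, points):
    """C_n(x) = (1/n) * #{i : X_i* <= x <= Y_i*}, evaluated at each point."""
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    pts = np.atleast_1d(np.asarray(points, dtype=float))[:, None]
    return np.mean((x_obs <= pts) & (pts <= y_obs), axis=1)


def lynden_bell_cdf(x_obs, y_obs, points):
    """F_n(x) = product over {i : X_i* > x} of (1 - 1/(n C_n(X_i*)))."""
    x_obs = np.asarray(x_obs, dtype=float)
    n = x_obs.size
    # n C_n(X_i*) >= 1 because observation i itself satisfies X_i* <= X_i* <= Y_i*.
    factors = 1.0 - 1.0 / (n * c_n(x_obs, y_obs, x_obs))
    pts = np.atleast_1d(np.asarray(points, dtype=float))
    # np.prod of an empty selection is 1, matching the empty-product convention.
    return np.array([np.prod(factors[x_obs > t]) for t in pts])
```

When every $Y_i^*$ exceeds every $X_i^*$ (no effective truncation), $nC_n(X^*_{i,n}) = i$ and the product telescopes to the empirical distribution function, which gives a quick sanity check of the implementation.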
Our goal is to adapt the famous Hill estimator to the context of right-truncation. It is well known that (see
Remark 1.2.3 in [?] for instance)

$$E[\log(X/t) \mid X > t] = \frac{1}{\bar F(t)}\int_t^{\infty}\log(x/t)\,dF(x)$$

tends to $\gamma_1$ as $t \to +\infty$, where $\bar F = 1 - F$. If $(t_n)$ is a sequence of positive thresholds growing to infinity with $n$, we can then define
a random version of $\phi(x) = (\bar F(t))^{-1}\log(x/t)\,1_{x>t}$ by $\hat\phi_n(x) = (\bar F_n(t_n))^{-1}\log(x/t_n)\,1_{x>t_n}$, with $\bar F_n = 1 - F_n$, and consequently a natural
adaptation of the Hill estimator for $\gamma_1$ is (see relations (1.9) and (1.10) in [?], in the left-truncation case, for details
about Lynden-Bell integrals)

$$\hat\gamma_n = \int \hat\phi_n(x)\,dF_n(x) = \frac{1}{n}\sum_{i=1}^n \hat\phi_n(X_i^*)\,\frac{F_n(X_i^*)}{C_n(X_i^*)},$$

which leads to

$$\hat\gamma_n = \frac{1}{n\bar F_n(t_n)}\sum_{i=1}^n \log\left(\frac{X_i^*}{t_n}\right)\frac{F_n(X_i^*)}{C_n(X_i^*)}\,1_{X_i^*>t_n}. \tag{4}$$


Note that this principle has already been successfully applied in the censoring framework in [11] (see equation (7) there),
where the role of the Lynden-Bell estimator was played by the Kaplan-Meier estimator. However, here, the threshold
$t_n$ is deterministic instead of being an order statistic. The asymptotic properties of $\hat\gamma_n$ are stated in Theorem 1.
Naturally, the lighter the truncation, the closer our estimator $\hat\gamma_n$ gets to the usual Hill estimator.
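The estimator (4) can be computed directly from the observed couples, as in the following self-contained Python sketch. The function name `lynden_bell_hill` is hypothetical, ties are assumed absent, and the threshold `t` is passed as a plain number (in practice it may be replaced by an order statistic, as done in the simulations of Section 3).

```python
import numpy as np


def lynden_bell_hill(x_obs, y_obs, t):
    """Estimator (4): Hill-type Lynden-Bell integral above a fixed threshold t."""
    x = np.asarray(x_obs, dtype=float)
    y = np.asarray(y_obs, dtype=float)
    # n * C_n(X_i*) = #{j : X_j* <= X_i* <= Y_j*}; always >= 1 since X_i* <= Y_i*.
    n_c = np.sum((x[None, :] <= x[:, None]) & (x[:, None] <= y[None, :]), axis=1)
    factors = 1.0 - 1.0 / n_c

    def f_n(s):
        # F_n(s) = product of the factors attached to observations larger than s.
        return float(np.prod(factors[x > s]))

    tail = 1.0 - f_n(t)  # estimated survival probability at the threshold
    if tail <= 0.0:
        raise ValueError("no observation above the threshold t")
    above = x > t
    # The jump of F_n at X_i* equals F_n(X_i*) / (n C_n(X_i*)).
    jumps = np.array([f_n(s) for s in x[above]]) / n_c[above]
    return float(np.sum(np.log(x[above] / t) * jumps) / tail)
```

With untruncated data the weights $F_n(X_i^*)/C_n(X_i^*)$ all equal 1, so the function returns the mean log-excess over `t`, i.e. the Hill estimator with a deterministic threshold.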

We will use this estimator of the tail index $\gamma_1$ in order to estimate an extreme quantile, following a classical
scheme. More precisely, let $(p_n)$ be some sequence of quantile orders tending to 0, such that $p_n = o(\bar F(t_n))$. If $x_{p_n}$
denotes the quantile of $F$ of order $1 - p_n$, i.e. solving $\bar F(x_{p_n}) = p_n$, then, in this heavy-tailed context (see (6) below),
it is easy to see that we can define an estimator of $x_{p_n}$ as

$$\hat x_{p_n,t_n} = t_n\left(\frac{\bar F_n(t_n)}{p_n}\right)^{\hat\gamma_n}. \tag{5}$$


In the situation of untruncated data, this is a classical estimator for an extreme quantile based on the approximation 
of the log relative excesses by a Pareto distribution in the heavy-tailed context, where Fn is in this case the empirical 
distribution function. 
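Formula (5) is a one-line extrapolation once the threshold, the estimated tail probability at the threshold and the tail index estimate are available. The sketch below assumes these three quantities have already been computed; the function name `weissman_quantile` and its argument names are hypothetical.

```python
def weissman_quantile(t, tail_prob_at_t, gamma_hat, p):
    """Extrapolated quantile of order 1 - p, as in formula (5).

    t              : threshold t_n
    tail_prob_at_t : estimate of the survival probability at t_n
    gamma_hat      : estimate of the extreme value index
    p              : target tail probability p_n, smaller than tail_prob_at_t
    """
    if not 0.0 < p < tail_prob_at_t:
        raise ValueError("p must lie strictly between 0 and the tail probability at t")
    return t * (tail_prob_at_t / p) ** gamma_hat
```

For an exact Pareto tail $\bar F(x) = x^{-1/\gamma}$ the formula is exact: with $\gamma = 1/2$, $t = 2$ (so that $\bar F(t) = 1/4$) and $p = 0.01$, it returns the true quantile $p^{-\gamma} = 10$.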


2.2. Assumptions and results

The first order condition assumed in this work is the following:

$$\bar F \in RV_{-1/\gamma_1} \quad\text{and}\quad \bar G \in RV_{-1/\gamma_2} \quad\text{with}\quad 0 < \gamma_1 < \gamma_2. \tag{6}$$

In other words, we assume that the tail of the truncating variable $Y$ is heavier than the tail of the variable $X$ of
interest. This condition is needed on many occasions in the proofs of our results, and is due to the presence (in (4)) of
the Lynden-Bell estimator, evaluated in the tail. Note that this implies the finiteness of the integral $\int dF(x)/\bar G(x)$
(which is a sufficient condition sometimes stated in papers dealing with the asymptotic normality of $F_n$).

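Moreover, if we denote by $l_F$ the slowly varying function associated to $\bar F$ (i.e. such that $\bar F(x) = x^{-1/\gamma_1}l_F(x)$), the
second order condition we consider is the classical SR2 condition for $l_F$ (see [S]),

$$\frac{l_F(tx)}{l_F(t)} - 1 \sim h_{\rho_1}(x)\,g(t) \quad\text{as } t \to +\infty \qquad (\forall x > 1), \tag{7}$$

where $g$ is a positive measurable function, regularly varying with index $\rho_1 \le 0$, and $h_{\rho_1}(x) = (x^{\rho_1} - 1)/\rho_1$ when $\rho_1 < 0$, or
$h_{\rho_1}(x) = \log x$ when $\rho_1 = 0$.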

The first assumption on the threshold sequence $(t_n)$ will be that, if we set $\bar H = \bar F\,\bar G$ (note that $H = 1 - \bar H$ is the
distribution function of $\min(X, Y)$), $(t_n)$ satisfies

$$n\bar H(t_n) \to +\infty. \tag{8}$$
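The asymptotic normality result will then require the following condition on $(t_n)$:

$$\sqrt{n\bar H(t_n)}\,g(t_n) \to \lambda \quad\text{for some } \lambda > 0. \tag{9}$$

Theorem 1. Under assumptions (6), (7), (8) and (9), as $n$ tends to infinity,

$$\sqrt{n\bar H(t_n)}\,(\hat\gamma_n - \gamma_1) \xrightarrow{d} \mathcal{N}(\lambda m, s^2),$$

where

$$m = \begin{cases}\dfrac{\gamma_1^2}{1 - \gamma_1\rho_1} & \text{if } \rho_1 < 0,\\[4pt] \gamma_1^2 & \text{if } \rho_1 = 0,\end{cases} \qquad\text{and}\qquad s^2 = p\,\gamma_1^2\,\frac{1 + (\gamma_1/\gamma_2)^2}{(1 - \gamma_1/\gamma_2)^3}.$$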
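To make the constants of Theorem 1 concrete, the following Python sketch evaluates $m$ and $s^2$ and checks that $s^2$ agrees with the combination $p\,(c_2 - 2\gamma_1 c_1 + \gamma_1^2 c_0)$ of the constants $c_k = \gamma_1^k k!/(1-\gamma_1/\gamma_2)^{k+1}$ of Lemma 8. The closed form used for $s^2$, namely $p\gamma_1^2(1+(\gamma_1/\gamma_2)^2)/(1-\gamma_1/\gamma_2)^3$, is an assumption read off the variance constants stated in Lemma 5; the function names are hypothetical.

```python
from math import factorial


def lemma8_constant(k, gamma1, gamma2):
    """c_k = gamma1^k * k! / (1 - gamma1/gamma2)^(k+1)."""
    q = 1.0 - gamma1 / gamma2
    return gamma1 ** k * factorial(k) / q ** (k + 1)


def asymptotic_constants(gamma1, gamma2, rho1, p):
    """Return (m, s2) for the limiting law N(lambda * m, s2) of Theorem 1."""
    if not 0.0 < gamma1 < gamma2:
        raise ValueError("condition (6) requires 0 < gamma1 < gamma2")
    if rho1 > 0.0:
        raise ValueError("the second order parameter rho1 must be <= 0")
    # The same expression covers rho1 = 0, where it reduces to gamma1^2.
    m = gamma1 ** 2 / (1.0 - gamma1 * rho1)
    q = 1.0 - gamma1 / gamma2
    # Assumed closed form of s^2 (see the lead-in above).
    s2 = p * gamma1 ** 2 * (1.0 + (gamma1 / gamma2) ** 2) / q ** 3
    return m, s2
```

For instance, `asymptotic_constants(0.25, 0.5, -1.0, 1.0)` corresponds to the "important truncation" setting $\gamma_1 = 1/4$, $\gamma_2 = 1/2$ of Section 3, with illustrative values for $\rho_1$ and $p$.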


Let us now turn to the results about the extreme quantile estimator defined in (5). Suppose that the sequence of
quantile orders $(p_n)$, tending to 0, satisfies the condition

$$\bar F(t_n)/p_n \to +\infty. \tag{10}$$

Theorem 2. Under (10) and the assumptions of Theorem 1, setting $d_n = \bar F(t_n)/p_n$, if $\rho_1 < 0$ and

$$\sqrt{n\bar H(t_n)}\,/\log d_n \to \infty \tag{11}$$

as $n$ tends to $\infty$, then

$$\frac{\sqrt{n\bar H(t_n)}}{\log d_n}\left(\frac{\hat x_{p_n,t_n}}{x_{p_n}} - 1\right) \xrightarrow{d} \mathcal{N}(\lambda m, s^2).$$


3. Finite sample behaviour 

In this section, we illustrate our results by presenting some graphics (resulting from an extensive study) which
compare, in terms of bias and root mean squared error (RMSE), our new estimator $\hat\gamma_n$ (defined in (4))
with the existing estimator $\hat\gamma_{GS}$ (defined in equation (1)) from [5], for two classes of heavy-tailed distributions:

• Burr($\beta, \tau, \lambda$) with distribution function $1 - \left(\frac{\beta}{\beta + x^{\tau}}\right)^{\lambda}$, for which the e.v.i. is $1/(\lambda\tau)$;

• Fréchet($\gamma$) with distribution function $\exp(-x^{-1/\gamma})$, for which the e.v.i. is $\gamma$.

Note that, in those simulations, we used the random threshold $X^*_{n-k_n,n}$ (where $1 \le k_n < n$) instead of a
deterministic threshold $t_n$ in the definition of $\hat\gamma_n$, and we also considered $k_1 = k_2$ in the definition of $\hat\gamma_{GS}$, which is
outside the scope of Theorem 3 in [5] (but the authors themselves restricted their simulations to this situation, which
was then presented as more manageable and convenient). Note that making $n$ vary did not provide notable findings,
so we kept the number $n$ of actual observations fixed.

We simulated 2000 random samples of size $n = 200$ in 6 different situations: 3 choices of families of distributions
(Burr truncated by another Burr, Fréchet truncated by another Fréchet, and Burr truncated by a Fréchet) combined
with 2 choices of truncation strength. This strength is measured by the ultimate probability $\alpha := \gamma_2/(\gamma_1 + \gamma_2)$ of
non-truncation in the tail (for a proof of this formula, see [3]), which is distinct from the overall $p = P(X \le Y)$: two
values were considered, $\alpha = 2/3$ (for $\gamma_1 = 1/4$ and $\gamma_2 = 1/2$, i.e. important truncation) and $\alpha = 8/9$ (for $\gamma_1 = 1/4$
and $\gamma_2 = 2$, i.e. mild truncation). The results are contained in Figure 1, where bias and RMSE are plotted against
different values of $k_n$, the number of excesses used.
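The simulation design described above can be mimicked with inverse-transform sampling followed by rejection of the truncated couples. The Python sketch below is only an assumed reconstruction of such a design, not the code used for the study: the Burr distribution function is taken to be $1 - (\beta/(\beta + x^\tau))^\lambda$, and all function names are hypothetical.

```python
import numpy as np


def burr_quantile(u, beta, tau, lam):
    """Inverse of the Burr distribution function 1 - (beta / (beta + x**tau))**lam."""
    return (beta * ((1.0 - u) ** (-1.0 / lam) - 1.0)) ** (1.0 / tau)


def frechet_quantile(u, gamma):
    """Inverse of the Frechet distribution function exp(-x**(-1/gamma))."""
    return (-np.log(u)) ** (-gamma)


def ultimate_non_truncation(gamma1, gamma2):
    """alpha = gamma2 / (gamma1 + gamma2), the truncation strength indicator."""
    return gamma2 / (gamma1 + gamma2)


def truncated_sample(draw_x, draw_y, n, rng):
    """Draw couples (X, Y) until n of them satisfy X <= Y; return those n couples."""
    xs, ys = [], []
    while len(xs) < n:
        u, v = rng.uniform(size=2)
        x, y = draw_x(u), draw_y(v)
        if x <= y:  # the couple is observed only when X is not truncated by Y
            xs.append(x)
            ys.append(y)
    return np.array(xs), np.array(ys)


# Example: Burr(10, 4, 1) truncated by Burr(10, 2, 1), n = 200 observed couples.
rng = np.random.default_rng(0)
x_obs, y_obs = truncated_sample(
    lambda u: burr_quantile(u, 10.0, 4.0, 1.0),
    lambda v: burr_quantile(v, 10.0, 2.0, 1.0),
    200,
    rng,
)
```

Sampling until exactly $n$ couples are observed matches the convention of working conditionally on $N = n$; the seed and the use of `numpy.random.default_rng` are illustrative choices.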

This section also contains graphics illustrating the behaviour of our extreme quantile estimator of $x_{p_n}$ (again
computed with the random threshold $X^*_{n-k_n,n}$ instead of $t_n$). Under the same simulation framework described above,
we considered the estimation of the extreme quantile $x_{p_n}$ with $p_n = 0.03$. Results are displayed in Figure 2.

The main conclusion we can draw from our intensive simulation study is that our estimator $\hat\gamma_n$ seems to behave
systematically better (both in terms of bias and RMSE) than the existing estimator $\hat\gamma_{GS}$ used with $k_1 = k_2$, whatever
the distributions and the value of $\alpha$ are (and changing the sample size yields the same conclusion). Nonetheless,
the comparison may be a bit delicate, since the properties of $\hat\gamma_{GS}$ are only proved when the two numbers $k_1$ and $k_2$
are quite distant from each other. On the other hand, the performance of our estimator clearly diminishes when
the ultimate proportion of non-truncation $\alpha$ decreases (which is equivalent to $\gamma_1$ getting closer to $\gamma_2$, which notably
increases the asymptotic variance of our estimator), but this phenomenon also holds (and to a greater extent) for
$\hat\gamma_{GS}$. According to our investigations, and unsurprisingly, a small value of $|\rho_1|$ also implies a lesser performance.
Concerning the bias, since our estimator of $\gamma_1$ is based on the same idea as the Hill estimator in the complete data
setting, the relatively high bias observed is neither surprising nor unbearable; and it is always lower than the bias
of $\hat\gamma_{GS}$.

Figure 1: Comparison of bias and RMSE (respectively left and right in each subfigure) for $\hat\gamma_n$ (plain) and $\hat\gamma_{GS}$ (dashed), where $\gamma_1 = 1/4$,
$\gamma_2 = 1/2$ and $\alpha = 2/3$ (important truncation) for subfigures (a), (c), (e), and where $\gamma_1 = 1/4$, $\gamma_2 = 2$ and $\alpha = 8/9$ (mild truncation) for
subfigures (b), (d), (f). Panels: (a) Burr(10, 4, 1) truncated by Burr(10, 2, 1); (b) Burr(10, 4, 1) truncated by Burr(10, 1, 1/2);
(c) Fréchet(1/4) truncated by Fréchet(1/2); (d) Fréchet(1/4) truncated by Fréchet(2); (e) Burr(10, 4, 1) truncated by Fréchet(1/2);
(f) Burr(10, 4, 1) truncated by Fréchet(2).

Concerning our new extreme quantile estimator $\hat x_{p_n,t_n}$, the finite sample behaviour seems quite satisfactory, even
if its performance depends on the value of $p_n$ and on the truncation strength.

4. Conclusion 

This paper addressed the problem of estimating tails (extreme value index $\gamma_1$ and extreme quantiles) of randomly
right-truncated data, when both the truncated and the truncating variables are heavy-tailed. This framework was
first considered in [5], where a first estimator of $\gamma_1$ was proposed. We propose here an alternative
approach, leading to an estimator of $\gamma_1$ which takes the form of a Lynden-Bell integral of some particular function,
and is therefore a natural version of the Hill estimator in this truncation context. Contrary to the situation of
[5] (for which the choice of the numbers of upper order statistics $k_1$ and $k_2$ in the estimator $\hat\gamma_{GS}$ defined in (1) could
remain very delicate in practice), a single tuning parameter has to be determined (the threshold $t_n$, or in practice
the number of upper order statistics), and experimental results are very encouraging.

Concerning the asymptotic normality result for our estimator, the restriction that the truncating variable has a
heavier tail than the truncated variable seems to be unavoidable, and improving the performance in terms of bias is
an open problem, as is the extension of the approach to truncated data with non-negative extreme value index.


5. Proofs of the results 


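5.1. Proof of Theorem 1

We introduce the following important notations: first

$$\tilde\gamma_n = \frac{1}{n}\sum_{i=1}^n V_{i,n}, \quad\text{where}\quad V_{i,n} = \frac{1}{\bar F(t_n)}\log\left(\frac{X_i^*}{t_n}\right)\frac{F(X_i^*)}{C(X_i^*)}\,1_{X_i^*>t_n}. \tag{12}$$

The variables $V_{i,n}$ are independent and identically distributed and, using (2), we readily have $E(V_{i,n}) = \frac{1}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,dF(x)$,
which converges to $\gamma_1$. Then we consider two (very close but nonetheless different) estimators of the cumulative hazard
function $\Lambda$ of $F$, $\Lambda = -\log F$: for any $t$, let (for the first definition below, $F_n(t)$ is supposed to be $> 0$)

$$\Lambda_n(t) = -\log F_n(t) \quad\text{and}\quad \hat\Lambda_n(t) = \sum_{X_i^*>t}\frac{1}{nC_n(X_i^*)}. \tag{13}$$

We will later approximate $\hat\Lambda_n(t_n)/\bar F(t_n)$ by $\frac{1}{n}\sum_{i=1}^n V'_{i,n}$, where the i.i.d. variables $V'_{i,n}$ are defined by

$$V'_{i,n} = \frac{1_{X_i^*>t_n}}{\bar F(t_n)\,C(X_i^*)}, \quad\text{with}\quad E(V'_{i,n}) = \frac{\Lambda(t_n)}{\bar F(t_n)}.$$

Finally we set $W_{i,n} = V_{i,n} - E(V_{i,n})$ and $W'_{i,n} = V'_{i,n} - E(V'_{i,n})$, with respective empirical means $W_n$ and $W'_n$, as well as

$$\Delta_n = \bar F_n(t_n)/\bar F(t_n) \quad\text{and}\quad v_n = n\bar H(t_n). \tag{14}$$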

Before proceeding to the proof of Theorem 1, let us state some lemmas.

Lemma 1. Under condition (6), we have $\Delta_n\hat\gamma_n - \tilde\gamma_n = o_P(v_n^{-1/2})$.

Lemma 2. Under conditions (6) and (8), the sequence $(\Delta_n)$ converges to 1 in probability.
Figure 2: Bias and RMSE (respectively left and right in each subfigure, plotted against $k$) [...] truncation) for subfigures (a), (c), (e),
and where $\gamma_1 = 1/4$, $\gamma_2 = 2$ [...]. Panels: (a) Burr(10, 4, 1) truncated by Burr(10, 2, 1); (b) Burr(10, 4, 1) truncated by Burr(10, 1, 1/2);
(c) Fréchet(1/4) truncated by Fréchet(1/2); (d) Fréchet(1/4) truncated by Fréchet(2); (e) Burr(10, 4, 1) truncated by Fréchet(1/2);
(f) Burr(10, 4, 1) truncated by Fréchet(2).

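Lemma 3. If $T = \max\{X_i^* : nC_n(X_i^*) = 1\}$ and $A_n = \{T \le t_n\}$, then, under condition (6), we have

$$\sqrt{v_n}\,\frac{\Lambda_n(t_n) - \hat\Lambda_n(t_n)}{\bar F(t_n)}\,1_{A_n} = o_P(1).$$

Lemma 4. Under conditions (6) and (8),

$$\sqrt{v_n}\,\frac{\hat\Lambda_n(t_n) - \Lambda(t_n)}{\bar F(t_n)} = \sqrt{v_n}\,W'_n + o_P(1). \tag{15}$$

For the next two lemmas, note that the quantities $s^2$ and $m$ have been defined in the statement of Theorem 1.

Lemma 5. Under conditions (6) and (8), the sequences $\sqrt{v_n}\,W_n$, $\sqrt{v_n}\,W'_n$ and $\sqrt{v_n}\,(W_n - \gamma_1 W'_n)$ converge in
distribution to centered Gaussian distributions of respective variances $2p\gamma_1^2/(1 - \gamma_1/\gamma_2)^3$, $p/(1 - \gamma_1/\gamma_2)$ and $s^2$.

Lemma 6. Under conditions (7) and (9), we have $\sqrt{v_n}\,(E(\tilde\gamma_n) - \gamma_1) \to \lambda m$.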


Note that Lemma 2 is a direct corollary of relation (17) and of Lemma 5, and that Lemma 4 is included in the
proof of Theorem 1 in [?]. We will provide the proofs of the other lemmas in the next subsections.


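Let us now turn to the proof of Theorem 1. We have, thanks to Lemma 1,

$$\sqrt{v_n}\,\Delta_n(\hat\gamma_n - \gamma_1) = \sqrt{v_n}\,(\tilde\gamma_n - \Delta_n\gamma_1) + o_P(1) = \sqrt{v_n}\,\big((\tilde\gamma_n - \gamma_1) - \gamma_1(\Delta_n - 1)\big) + o_P(1). \tag{16}$$

We consider

$$\Delta_n - 1 = \frac{\bar F_n(t_n) - \bar F(t_n)}{\bar F(t_n)} = -\frac{F_n(t_n) - F(t_n)}{\bar F(t_n)}$$

and we want to deal with this difference by introducing cumulative hazard functions (defined at the beginning of this
section). But if there exists some data value $X_i^*$ which is both greater than $t_n$ and such that $nC_n(X_i^*) = 1$, then
$F_n(t_n) = 0$ and $\Lambda_n(t_n)$ is undefined. In order to avoid this, we introduce the variable

$$T = \max\{X_i^* : nC_n(X_i^*) = 1\},$$

for which [S] proved that $P(T = \min_{i\le n} X_i^*)$ converges to 1. Therefore, if we set $A_n = \{T \le t_n\}$, then on $A_n$ we have
$F_n(t_n) > 0$ on the one hand, and on the other hand $P(A_n^c) \le P(T \ne \min_{i\le n} X_i^*) + P(\min_{i\le n} X_i^* > t_n)$, which tends to
0. We can thus write, using the mean value theorem,

$$\Delta_n - 1 = -\frac{\exp(-\Lambda_n(t_n)) - \exp(-\Lambda(t_n))}{\bar F(t_n)}\,1_{A_n} + \frac{\bar F_n(t_n) - \bar F(t_n)}{\bar F(t_n)}\,1_{A_n^c} = \delta_n\,\frac{\Lambda_n(t_n) - \Lambda(t_n)}{\bar F(t_n)}\,1_{A_n} + \frac{\bar F_n(t_n) - \bar F(t_n)}{\bar F(t_n)}\,1_{A_n^c},$$

where $\delta_n$ converges to 1 in probability, since both $\Lambda_n(t_n)$ and $\Lambda(t_n)$ converge to 0. Therefore, using successively
$P(A_n^c) \to 0$ and Lemmas 3 and 4, we can write

$$\sqrt{v_n}\,(\Delta_n - 1) = \delta_n 1_{A_n}\sqrt{v_n}\,\frac{\Lambda_n(t_n) - \Lambda(t_n)}{\bar F(t_n)} + o_P(1) = \delta_n 1_{A_n}\sqrt{v_n}\,W'_n + o_P(1) = \sqrt{v_n}\,W'_n + o_P(1). \tag{17}$$

On the other hand,

$$\sqrt{v_n}\,(\tilde\gamma_n - \gamma_1) = \sqrt{v_n}\,W_n + \sqrt{v_n}\,(E(\tilde\gamma_n) - \gamma_1)$$

and consequently, combining relations (16) and (17) with Lemmas 5 and 6, the theorem is proved:

$$\sqrt{v_n}\,(\hat\gamma_n - \gamma_1) = \Delta_n^{-1}\left\{\sqrt{v_n}\,(W_n - \gamma_1 W'_n) + \sqrt{v_n}\,(E(\tilde\gamma_n) - \gamma_1) + o_P(1)\right\} + o_P(1) \xrightarrow{d} \mathcal{N}(\lambda m, s^2).$$
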
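5.2. Proof of Theorem 2

Recall that $d_n = \bar F(t_n)/p_n \to \infty$, and the notations $\Delta_n = \bar F_n(t_n)/\bar F(t_n)$ (which satisfies (17)) and $v_n = n\bar H(t_n)$. We write

$$\frac{\hat x_{p_n,t_n}}{x_{p_n}} - 1 = \frac{t_n}{x_{p_n}}\left(\frac{\bar F_n(t_n)}{p_n}\right)^{\hat\gamma_n} - 1 = \Delta_n^{\hat\gamma_n}\left(T_n^1 + T_n^2 + T_n^1 T_n^2 + T_n^3\right),$$

where $T_n^1 := d_n^{\hat\gamma_n - \gamma_1} - 1$, $T_n^2 := \frac{t_n}{x_{p_n}}\,d_n^{\gamma_1} - 1$ and $T_n^3 := 1 - \Delta_n^{-\hat\gamma_n}$. We are going to prove that both $T_n^2$ and $T_n^3$ are
$o_P(\log d_n/\sqrt{v_n})$, and that $\frac{\sqrt{v_n}}{\log d_n}\,T_n^1 \xrightarrow{d} \mathcal{N}(\lambda m, s^2)$. This will conclude the proof, since $\Delta_n$, and hence $\Delta_n^{\hat\gamma_n}$, tends to 1
in probability.

Let us first focus on $T_n^1$. The mean value theorem yields

$$\frac{\sqrt{v_n}}{\log d_n}\,T_n^1 = \sqrt{v_n}\,(\hat\gamma_n - \gamma_1)\exp(E_n),$$

where $|E_n| \le |\hat\gamma_n - \gamma_1|\log d_n$, and therefore $E_n$ tends to 0 in probability thanks to Theorem 1 and assumption (11). The desired
result for $T_n^1$ is then implied by Theorem 1.

We now deal with $T_n^2$. Recalling that $\bar F(x) = x^{-1/\gamma_1}l_F(x)$, by definition of $x_{p_n}$ we have

$$T_n^2 = \left(\frac{l_F(x_{p_n})}{l_F(t_n)}\right)^{-\gamma_1} - 1.$$

We use the following representation of $l_F$ (see [7] page 1195) when $\rho_1 < 0$:

$$l_F(x) = C\left(1 + \rho_1^{-1}g(x) + o(g(x))\right), \quad\text{for } x \to +\infty.$$

Hence

$$\frac{l_F(x_{p_n})}{l_F(t_n)} = 1 - \rho_1^{-1}g(t_n)\left(1 - \frac{g(x_{p_n})}{g(t_n)} + o(1) + o\left(\frac{g(x_{p_n})}{g(t_n)}\right)\right).$$

But $g(x_{p_n})/g(t_n)$ tends to 0 because $x_{p_n}/t_n$ tends to infinity and

$$\left|g(x_{p_n})/g(t_n) - (x_{p_n}/t_n)^{\rho_1}\right| \le \sup_{y\ge 1}\left|g(yt_n)/g(t_n) - y^{\rho_1}\right| \to 0.$$

It follows that $l_F(x_{p_n})/l_F(t_n) - 1 = -\rho_1^{-1}g(t_n)(1 + o(1))$. Thus $\left|(l_F(x_{p_n})/l_F(t_n))^{-\gamma_1} - 1\right| \le c\,\left|l_F(x_{p_n})/l_F(t_n) - 1\right|$ for some
constant $c$, and then

$$\frac{\sqrt{v_n}}{\log d_n}\,|T_n^2| \le c\,|\rho_1|^{-1}\sqrt{v_n}\,g(t_n)\,\frac{1 + o(1)}{\log d_n}.$$

Assumption (9) and the fact that $\log d_n$ tends to infinity conclude the proof for $T_n^2$.

For $T_n^3$ we use the mean value theorem to write

$$T_n^3 = \hat\gamma_n K_n^{-\hat\gamma_n - 1}(\Delta_n - 1),$$

with $K_n$ tending to 1. In view of (17) and Lemma 5, we thus have $\frac{\sqrt{v_n}}{\log d_n}(\Delta_n - 1) = O_P(1)/\log d_n = o_P(1)$, and then
the desired negligibility of $T_n^3$ follows.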

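5.3. Proof of Lemma 1

We have $\Delta_n\hat\gamma_n = \tilde\gamma_n + S_{n,1} + S_{n,2}$, with

$$S_{n,1} = \frac{1}{\bar F(t_n)}\,\frac{1}{n}\sum_{i=1}^n \frac{F_n(X_i^*) - F(X_i^*)}{C_n(X_i^*)}\log\left(\frac{X_i^*}{t_n}\right)1_{X_i^*>t_n}$$

and

$$S_{n,2} = \frac{1}{\bar F(t_n)}\,\frac{1}{n}\sum_{i=1}^n \left(\frac{1}{C_n(X_i^*)} - \frac{1}{C(X_i^*)}\right)F(X_i^*)\log\left(\frac{X_i^*}{t_n}\right)1_{X_i^*>t_n}.$$

Let us show that both $\sqrt{v_n}\,S_{n,1}$ and $\sqrt{v_n}\,S_{n,2}$ are $o_P(1)$. On one hand,

$$|\sqrt{v_n}\,S_{n,1}| \le \left(\sqrt{n}\sup_{x\ge t_n}|F_n(x) - F(x)|\right)\left(\sup_{X_i^*>t_n}\frac{C(X_i^*)}{C_n(X_i^*)}\right)\sqrt{\bar H(t_n)}\,V_n^1, \tag{18}$$

where $V_n^1 := \frac{1}{n}\sum_{i=1}^n U^1_{i,n}$ with

$$U^1_{i,n} = \frac{1}{\bar F(t_n)}\,\frac{1}{C(X_i^*)}\log\left(\frac{X_i^*}{t_n}\right)1_{X_i^*>t_n}.$$

Using (2) and (3) yields

$$E(U^1_{i,n}) = \frac{1}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,\frac{dF(x)}{F(x)} = (1 + o(1))\,\frac{1}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,dF(x),$$

which converges to $\gamma_1$; Markov's inequality then yields $\sqrt{\bar H(t_n)}\,V_n^1 = o_P(1)$. On the other hand,

$$|\sqrt{v_n}\,S_{n,2}| \le \left(\sup_{X_i^*>t_n}\frac{C(X_i^*)}{C_n(X_i^*)}\right)\left(\sqrt{n}\sup_{X_i^*>t_n}|C_n(X_i^*) - C(X_i^*)|\right)\sqrt{\bar H(t_n)}\,V_n^2, \tag{19}$$

where $V_n^2 := \frac{1}{n}\sum_{i=1}^n U^2_{i,n}$ with

$$U^2_{i,n} = \frac{F(X_i^*)}{\bar F(t_n)\,C^2(X_i^*)}\log\left(\frac{X_i^*}{t_n}\right)1_{X_i^*>t_n}.$$

Using again (2) and (3), we have

$$E(U^2_{i,n}) = \frac{p}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,\frac{dF(x)}{F(x)\bar G(x)} = p\,(1 + o(1))\,\frac{1}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,\frac{dF(x)}{\bar G(x)}.$$

By Lemma 8 (where the constant $c_1$ is defined), it follows that $E(U^2_{i,n}) = (1 + o(1))\,p\,c_1/\bar G(t_n)$, and Markov's inequality then yields
$\sqrt{\bar H(t_n)}\,V_n^2 = O_P\big((\bar F(t_n)/\bar G(t_n))^{1/2}\big) = o_P(1)$. Combining (18) and (19) with Lemma 7 ends the proof.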


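5.4. Proof of Lemma 3

Recall that $T = \max\{X_i^* ; nC_n(X_i^*) = 1\}$ and that we previously saw that $P(A_n) \to 1$ when $A_n = \{T \le t_n\}$.
Using the fact that $0 \le -\log(1 - x) - x \le x^2/(1 - x)$ for any $0 \le x < 1$, and that, on $A_n$, we have $nC_n(X_i^*) \ge 2$ for every
$X_i^* > t_n$, we can write that

$$0 \le \sqrt{v_n}\,\frac{\Lambda_n(t_n) - \hat\Lambda_n(t_n)}{\bar F(t_n)}\,1_{A_n} \le 2\,\frac{\sqrt{v_n}}{\bar F(t_n)}\sum_{i=1}^n \frac{1_{X_i^*>t_n}}{n^2C_n^2(X_i^*)}.$$

Using Lemma 7, we have

$$\frac{\sqrt{v_n}}{\bar F(t_n)}\sum_{i=1}^n \frac{1_{X_i^*>t_n}}{n^2C_n^2(X_i^*)} \le O_P(1)\,\sqrt{\frac{\bar G(t_n)}{n\bar F(t_n)}}\;\frac{1}{n}\sum_{i=1}^n \frac{1_{X_i^*>t_n}}{C^2(X_i^*)}.$$

Noting $Z_n = \frac{1}{n}\sum_{i=1}^n \frac{1_{X_i^*>t_n}}{C^2(X_i^*)}$ and using (2) and (3), we have

$$E(Z_n) = \int_{t_n}^{\infty}\frac{p\,dF(x)}{F^2(x)\,\bar G(x)} = p\,(1 + o(1))\int_{t_n}^{\infty}\frac{dF(x)}{\bar G(x)}.$$

Via Lemma 8, $E\left(\sqrt{\frac{\bar G(t_n)}{n\bar F(t_n)}}\,Z_n\right)$ tends to 0 and therefore $\sqrt{\frac{\bar G(t_n)}{n\bar F(t_n)}}\,Z_n = o_P(1)$ by Markov's inequality, which ends the
proof of the lemma.
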
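5.5. Proof of Lemma 5

For brevity, we only prove the third part of the lemma. First, using relations (2) and (3) and Lemma 8 (wherein the
constants $c_0 = 1/q$, $c_1 = \gamma_1/q^2$, $c_2 = 2\gamma_1^2/q^3$ are defined, with $q = 1 - \gamma_1/\gamma_2$), it is easily seen that

$$E(V_{1,n}) \to \gamma_1, \quad E(V'_{1,n}) \to 1, \quad \bar H(t_n)\,E(V_{1,n}^2) \to p\,c_2, \quad \bar H(t_n)\,E((V'_{1,n})^2) \to p\,c_0, \quad \bar H(t_n)\,E(V_{1,n}V'_{1,n}) \to p\,c_1.$$

Introducing $U_{i,n} = W_{i,n} - \gamma_1 W'_{i,n}$ and $S_n = \sum_{i\le n} U_{i,n}$, we thus obtain ($s^2$ is defined in the statement of Theorem 1)

$$\mathrm{Var}(U_{1,n}) = \mathrm{Var}(V_{1,n} - \gamma_1 V'_{1,n}) = \frac{s^2}{\bar H(t_n)}(1 + o(1))$$

and consequently $\sqrt{v_n}\,(W_n - \gamma_1 W'_n) = \sqrt{v_n}\,S_n/n = s\,(1 + o(1))\,S_n/\sqrt{\mathrm{Var}(S_n)}$, which converges in distribution to $\mathcal{N}(0, s^2)$
as soon as Lyapunov's condition holds. After some simplifications, Lyapunov's condition becomes the existence of
some $\delta > 0$ such that

$$n^{-\delta/2}\,(\bar H(t_n))^{1+\delta/2}\,E(|U_{1,n}|^{2+\delta}) \to 0.$$

Proceeding as in [2], and noting that $E(V_{1,n}) - \gamma_1 E(V'_{1,n})$ tends to 0, the double application of the inequality
$|a + b|^{2+\delta} \le 2^{1+\delta}(|a|^{2+\delta} + |b|^{2+\delta})$ shows that it suffices to prove the following, for some $\delta > 0$:

$$n^{-\delta/2}\,(\bar H(t_n))^{1+\delta/2}\,E(|V|^{2+\delta}) \to 0 \quad\text{for both } V = V_{1,n} \text{ and } V = V'_{1,n}. \tag{20}$$

We prove this property for $V = V_{1,n}$, the proof for $V = V'_{1,n}$ being very similar. We have

$$E(|V_{1,n}|^{2+\delta}) = p^{1+\delta}\,\bar F(t_n)^{-2-\delta}\int_{t_n}^{\infty}\log^{2+\delta}(x/t_n)\,\frac{dF(x)}{\bar G^{1+\delta}(x)}.$$

Mimicking the proof of Lemma 8 stated in the appendix, and because $\delta$ can be chosen arbitrarily small (so that
$(1 + \delta)/\gamma_2$ remains lower than $1/\gamma_1$), we can prove that

$$\int_{t_n}^{\infty}\log^{2+\delta}(x/t_n)\,\frac{dF(x)}{\bar G^{1+\delta}(x)} = O\left(\frac{\bar F(t_n)}{\bar G^{1+\delta}(t_n)}\right)$$

and therefore, since we assumed that $n\bar H(t_n) \to \infty$, the desired property (20) holds for $V = V_{1,n}$:

$$n^{-\delta/2}\,(\bar H(t_n))^{1+\delta/2}\,E(|V_{1,n}|^{2+\delta}) \le O(1)\,n^{-\delta/2}\,(\bar H(t_n))^{1+\delta/2}\,\bar F(t_n)^{-2-\delta}\,\bar F(t_n)\,\bar G(t_n)^{-1-\delta} = O(1)\,(n\bar H(t_n))^{-\delta/2} \to 0.$$

5.6. Proof of Lemma 6

Recall that

$$E(\tilde\gamma_n) = \frac{1}{\bar F(t_n)}\int_{t_n}^{\infty}\log(x/t_n)\,dF(x) = \int_1^{\infty}\frac{\bar F(yt_n)}{\bar F(t_n)}\,\frac{dy}{y}$$

by integration by parts and change of variables. Since $\bar F(y) = y^{-1/\gamma_1}l_F(y)$, we have

$$E(\tilde\gamma_n) - \gamma_1 = \int_1^{\infty}y^{-1/\gamma_1 - 1}\left(\frac{l_F(yt_n)}{l_F(t_n)} - 1\right)dy$$

and using assumption (7) and Proposition 3.1 in [7], we can write

$$E(\tilde\gamma_n) - \gamma_1 = g(t_n)\,(1 + o(1))\int_1^{\infty}y^{-1/\gamma_1 - 1}h_{\rho_1}(y)\,dy.$$

The result then follows from assumption (9) and the fact that $\int_1^{\infty}y^{-1/\gamma_1 - 1}h_{\rho_1}(y)\,dy = m$.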


6. Appendix 

This appendix contains two lemmas: Lemma 7 contains results which are proved elsewhere but are crucial for
our proofs, and which we thus restate here, whereas Lemma 8 is a variant of a particular case of Lemma 2 in [?], and
states essential equivalences for our proofs.

Lemma 7. If $t_n$ tends to infinity with $n$, then
(a) $\sqrt{n}\,\sup_{x\ge t_n}|F_n(x) - F(x)| = O_P(1)$.
(b) $\sup_{1\le i\le n}\left\{\frac{C(X_i^*)}{C_n(X_i^*)}\,;\,X_i^* > t_n\right\} = O_P(1)$.
(c) $\sqrt{n}\,\sup_{1\le i\le n}\left\{|C_n(X_i^*) - C(X_i^*)|\,;\,X_i^* > t_n\right\} = O_P(1)$.

Proof

(a) is a consequence of point 6 page 176 in [10]; (b) is proved in [?] (see Lemma 5 there), following the ideas contained in
[9]. Since $C_n = F_n^* - G_n^*$, where $F_n^*$ and $G_n^*$ are respectively the empirical distribution functions associated with $F^*$ and $G^*$,
(c) is a consequence of $\sqrt{n}\,\sup_{x\ge 0}|F_n^*(x) - F^*(x)| = O_P(1)$ and $\sqrt{n}\,\sup_{x\ge 0}|G_n^*(x) - G^*(x)| = O_P(1)$ (see [10] pages
172-173).

Lemma 8. Under condition (6), for any $k \in \mathbb{N}$, as $t \to \infty$,

$$\frac{\bar G(t)}{\bar F(t)}\int_t^{\infty}\log^k(x/t)\,\frac{dF(x)}{\bar G(x)} \to c_k, \quad\text{where}\quad c_k = \frac{\gamma_1^k\,k!}{(1 - \gamma_1/\gamma_2)^{k+1}}.$$

Proof 

Let us note $\alpha = 1/\gamma_2$ and $\beta = 1/\gamma_1$, which satisfy $0 < \alpha < \beta$ by assumption. We need to prove that the following
quantity converges to $c_k$ (below, $\delta > 0$ is arbitrarily small)

(21)

In the last line, we used Theorem 1.5.2 in [3] with the fact that $x \mapsto x^{-\alpha-\delta}/\bar G(x)$ is regularly varying of order $-\delta$.
It thus remains to prove that $I_{n,k}(\alpha)$ converges to $c_k$ (the same being true for $I_{n,k}(\alpha + \delta)$). We now introduce the
notations: for $\theta > 0$,

$$J_k(\theta) = \int_1^{\infty}\log^k(y)\,y^{-\theta-1}\,dy = \frac{k!}{\theta^{k+1}} \quad\text{and}\quad J_{n,k} = \int_1^{\infty}\log^k(y)\,y^{\alpha-1}\,\frac{\bar F(yt_n)}{\bar F(t_n)}\,dy.$$

For any $\delta \in\, ]0, \beta - \alpha[$, since the function $x \mapsto x^{\beta-\delta}\bar F(x)$ is regularly varying of order $-\delta$, we have

We thus have, by integration by parts and the relation $kJ_{k-1}(\theta) = \theta J_k(\theta)$,